Since 2009, there have been 278 mass shootings in the United States, resulting in 1569 people shot and killed and 1000 people shot and wounded. The purpose of this analysis is to explore correlations between public mass shootings versus unemployment and household median income. This could help public officials deploy more effective public policies based on observations seen in the data set. Hopefully helping address future incidents from happening.
https://www.motherjones.com/politics/2012/12/mass-shootings-mother-jones-full-data/
https://www.icip.iastate.edu/tables/employment/unemployment-states
https://geodacenter.github.io/data-and-lab//houston/
I would like to develop this hypothesis to enable the United States Government to better strategize and improve the understanding of public mass shooting incidents.
This study should enable the US government to address trends of shooter background to target social policy more effectively.
I will also compare the incidents by Region Unemployment Rate of State
## case location date summary
## Length:126 Length:126 Length:126 Length:126
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## fatalities injured total_victims location.1
## Min. : 3.000 Min. : 0.00 Min. : 3.00 Length:126
## 1st Qu.: 4.000 1st Qu.: 1.00 1st Qu.: 6.25 Class :character
## Median : 6.000 Median : 3.00 Median : 10.00 Mode :character
## Mean : 7.968 Mean : 11.58 Mean : 19.55
## 3rd Qu.: 8.750 3rd Qu.: 10.00 3rd Qu.: 17.00
## Max. :58.000 Max. :546.00 Max. :604.00
## age_of_shooter prior_signs_mental_health_issues mental_health_details
## Length:126 Length:126 Length:126
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
## weapons_obtained_legally where_obtained weapon_type
## Length:126 Length:126 Length:126
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
## weapon_details Race Gender sources
## Length:126 Length:126 Length:126 Length:126
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## mental_health_sources sources_additional_age latitude longitude
## Length:126 Length:126 Min. :21.32 Min. :-157.88
## Class :character Class :character 1st Qu.:33.80 1st Qu.:-117.56
## Mode :character Mode :character Median :38.32 Median : -90.87
## Mean :37.38 Mean : -96.84
## 3rd Qu.:41.47 3rd Qu.: -81.51
## Max. :48.46 Max. : -71.08
## type year
## Length:126 Min. :1982
## Class :character 1st Qu.:2000
## Mode :character Median :2013
## Mean :2009
## 3rd Qu.:2017
## Max. :2022
The chart below shows an increase in fatality over the last few decades.
The Boxplot also shows us an increase in fatalities in recent years and an average fatality rate of 5-10 deceased in a shooting.| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|
| fatalities | 126 | 7.968 | 7.721 | 3 | 4 | 8.75 | 58 |
| injured | 126 | 11.579 | 49.1 | 0 | 1 | 10 | 546 |
| total_victims | 126 | 19.548 | 54.514 | 3 | 6.25 | 17 | 604 |
| Race | 126 | ||||||
| … Asian | 8 | 6.3% | |||||
| … Black | 21 | 16.7% | |||||
| … Latino | 10 | 7.9% | |||||
| … Native American | 3 | 2.4% | |||||
| … Unidentified | 17 | 13.5% | |||||
| … White | 67 | 53.2% | |||||
| Gender | 126 | ||||||
| … F | 5 | 4% | |||||
| … M | 121 | 96% | |||||
| latitude | 126 | 37.384 | 5.623 | 21.32 | 33.8 | 41.468 | 48.461 |
| longitude | 126 | -96.842 | 17.914 | -157.876 | -117.556 | -81.508 | -71.076 |
| type | 126 | ||||||
| … Mass | 107 | 84.9% | |||||
| … Spree | 19 | 15.1% | |||||
| year | 126 | 2008.968 | 10.563 | 1982 | 2000.25 | 2017 | 2022 |
Fig. State Unemployment vs State Fatalities X-squared = 5.1174, df = 1, p-value = 0.02369 Fig. Mental Health vs State Fatalities: X-squared = 0.00043219, df = 1, p-value = 0.9834 Fig. Weapon Legality vs State Fatalities: X-squared = 0.50809, df = 1, p-value = 0.476 Fig. Gender vs Race: X-squared = 10.855, df = 5, p-value = 0.05433
## `geom_smooth()` using formula 'y ~ x'
## [1] 38.83755 39.68663 34.11165 31.92597 39.75731 31.77107 40.78514 27.47104
## [9] 40.44390 38.99455 29.39282 36.05252 26.30483 40.05215 40.01876 29.27328
## [17] 39.87637 36.09574 41.52955 39.95903 42.23669 38.88103 32.78839 34.43628
## [25] 34.00862 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 33.74118
## [33] 32.33594 35.33343 37.76721 41.92947 41.26572 45.57191 40.76065 39.95890
## [41] 47.62290 34.42557 43.06057 39.96226 32.41084 42.50043 32.66451 33.85012
## [49] 39.60403 44.04624 35.82099 41.68563 47.61864 35.05299 37.79297 42.38106
## [57] 39.07869 42.48948 31.11712 38.25424 37.95770 37.36883 28.03319 35.66720
## [65] 32.55200 32.92517 25.79649 38.60111 42.84411 37.31610 33.83542 39.98696
## [73] 37.21043 30.36471 36.99719 34.17695 28.58029 48.46137 28.51972 34.07596
## [81] 43.28954 35.04716 31.13556 27.82803 48.05082 41.48710 47.87635 41.84767
## [89] 35.34939 39.10198 26.07275 44.20412 31.14172 25.86434 39.16380 27.96648
## [97] 33.78779 43.04451 40.70736 36.75442 41.75373 39.45566 39.45219 36.74638
## [105] 30.43360 32.78011 38.13599 38.87498 47.31296 41.79876 47.15277 38.58009
## [113] 41.90816 33.55986 26.11927 39.67560 40.72677 30.33218 38.39250 37.76595
## [121] 37.80438 33.94121 42.09980 37.22957 21.32006 41.66069
## [1] -104.81425 -86.32313 -84.58038 -102.27960 -84.18495 -106.37565
## [7] -77.83941 -81.45847 -79.92140 -76.54366 -95.14197 -86.61694
## [13] -80.26951 -79.38917 -122.39309 -98.05649 -104.98613 -115.17154
## [19] -75.94722 -82.59651 -85.67480 -104.84906 -79.93314 -119.87144
## [25] -118.49475 -74.98489 -73.31142 -93.31041 -87.86314 -104.82059
## [31] -122.33006 -118.10464 -110.97513 -79.41459 -87.55737 -88.75036
## [37] -96.06749 -88.90289 -111.89109 -76.08060 -122.31650 -119.86607
## [43] -88.10648 -83.00071 -88.63454 -71.07591 -97.38425 -84.37784
## [49] -105.07410 -123.02203 -90.66826 -72.72984 -117.64836 -78.87871
## [55] -122.39797 -76.87058 -121.54758 -83.14465 -97.72780 -85.75941
## [61] -121.29078 -122.03635 -80.64297 -97.42937 -117.04308 -96.83868
## [67] -80.22668 -121.41897 -83.25993 -121.88853 -117.85379 -105.25117
## [73] -93.23686 -87.28857 -121.58482 -118.87479 -81.29409 -122.33792
## [79] -81.37678 -117.27789 -123.33319 -85.31182 -97.78366 -97.54820
## [85] -122.17692 -120.54224 -95.01694 -87.62201 -118.91634 -84.51178
## [91] -80.14338 -88.46754 -97.77756 -80.31177 -119.76740 -82.57059
## [97] -117.85311 -87.96254 -74.08361 -76.06038 -88.33106 -76.20848
## [103] -76.30999 -119.80032 -91.08140 -96.80001 -97.42515 -76.99453
## [109] -122.33937 -72.57007 -122.46731 -90.40691 -87.87991 -81.72195
## [115] -80.10412 -104.84485 -73.63430 -81.65565 -122.36653 -122.40609
## [121] -122.27082 -84.21353 -75.91772 -80.41394 -157.87646 -91.53022
In order to visualize the clustering of incidents we join our data set by region to the map of the US. As we can see from the spatial mapping below the highest incidents of shootings have taken place in Nevada and Texas.
On observing the point clustering there seemed to visually be a few zones/ states in which the clustering of shootings was high. I wanted to see visually if there was a relationship between the shootings and demographic factors such as unemployment.
Joining shooting data with base map data:
First visualization of shooter data on base map plot:
##Visualizing clustering of incidents # Point Plot of shooting location:
##Visualizing clustering of incidents # Merging point plot with base map with labelling:
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_text_repel).
## Warning: ggrepel: 91 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
Combining Fatality gradient with shooter locations:
## Warning: ggrepel: 125 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## Warning: ggrepel: 73 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
Below we see a map with a gradient of unemployment levels within states and the location of the shootings. To the eye it seems like there was a relationship between shootings and the unemployment levels.
It is observed that the lighter states (high unemployment) is where most of the clusters lie as compared to the darker states (low unemployment)
## Warning: ggrepel: 89 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
The Variogram Analysis for the dataset unfortunately didn’t show a trend of spatial correlation. Leading to the selection of another dataset for further observation. The dataset selected shows the Homocide rates in the states with one of the highest Mass Shooting Fatality rates- Texas.
## np dist gamma dir.hor dir.ver id
## 1 248 1.029397 37.59879 0 0 var1
## 2 312 3.058207 68.24840 0 0 var1
## 3 394 5.045395 50.23096 0 0 var1
## 4 412 7.069526 78.65170 0 0 var1
## 5 470 9.108588 40.17872 0 0 var1
## 6 495 11.127538 73.86162 0 0 var1
## 7 529 13.203104 51.17108 0 0 var1
## 8 524 15.068132 62.92653 0 0 var1
## 9 525 17.183127 58.52952 0 0 var1
## 10 308 19.099637 58.01948 0 0 var1
## 11 278 21.183013 48.36871 0 0 var1
## 12 201 23.165882 57.58209 0 0 var1
## 13 290 25.313153 85.93621 0 0 var1
## 14 253 27.308062 88.89723 0 0 var1
## 15 251 29.197187 59.01594 0 0 var1
## Warning in fit.variogram(Vario_shootings, vgm(0.013, "Exp", 100, 1)): No
## convergence after 200 iterations: try different initial values?
For the first variable we have the Moran’s I coefficient at -0.02075940. While -1 is clustering of dissimilar values or perfect dispersion and 0 is no autocorrelation or perfect randomness. This tells us that the fatality in a shooting has little no clustering with their neighbors. This makes sense as the fatality of one shooting to another does not necessarily have a geographical correlation to other shootings.
For the second variable we have the Moran’s I coefficient at 0.003926306 . While 1 is perfect clustering of similar values or perfect dispersion and 0 is no autocorrelation or perfect randomness. This tells us that the injured in a shooting is still pretty random and has no consequence or correlation to what happens in a closest neighbor.
For the third variable we have the Moran’s I coefficient at 0.08664016 While 1 is perfect clustering of similar values and 0 is no autocorrelation or perfect randomness. This tells us that the age of a shooter is still pretty random but has slightly more correlation as compared fatality and injuries of a shooting. The coefficient is not conclusive but begins to wonder if there are any demographic influences causing people of a certain age in a region to act a certain way. To know more we would need a better dataset.
The coefficients indicate that the US shooting incidences show low autocorrelation to one another. It could be because of the premise of the data set is mass or spree shootings which implies more than 4 fatalities. It would be interesting to combine fatalities data with homicide rate data to be able to see a larger data bank of crime activity. For the moment our data suggests spatial independence.
## [1] FALSE
##
## Moran I test under randomisation
##
## data: SHOOTINGS$fatalities
## weights: nb2listw(SHOOTINGS_nb, style = "W", zero.policy = TRUE) n reduced by no-neighbour observations
##
##
## Moran I statistic standard deviate = 0.019393, p-value = 0.4923
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## -0.02075940 -0.02325581 0.01657035
## [1] TRUE
##
## Moran I test under randomisation
##
## data: SHOOTINGS$injured
## weights: nb2listw(SHOOTINGS_nb, style = "W", zero.policy = TRUE) n reduced by no-neighbour observations
##
##
## Moran I statistic standard deviate = NaN, p-value = NA
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.003926306 -0.023255814 -0.059827141
## [1] 39.68663 39.98696 41.75373 27.47104 34.17695 39.10198 36.05252 38.39250
## [9] 26.30483 37.76595 26.07275 44.20412 34.43628 38.87498 34.00862 47.31296
## [17] 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 37.80438 33.94121
## [25] 33.74118 39.16380 32.33594 41.79876 47.15277 42.09980 35.33343 37.76721
## [33] 41.92947 38.58009 41.26572 37.22957 39.95890 47.62290 34.42557 47.87635
## [41] 43.06057 39.96226 32.41084 41.90816 42.50043 27.96648 21.32006 32.66451
## [49] 33.85012 39.60403 44.04624 35.82099 41.68563 33.78779 33.55986 26.11927
## [57] 27.82803 47.61864 40.72677 35.05299 37.79297 42.38106 39.07869 42.48948
## [65] 31.11712 30.33218 38.25424 37.95770 37.36883 28.03319 32.55200 32.92517
## [73] 25.79649
## [1] -86.32313 -105.25117 -88.33106 -81.45847 -118.87479 -84.51178
## [7] -86.61694 -122.36653 -80.26951 -122.40609 -80.14338 -88.46754
## [13] -119.87144 -76.99453 -118.49475 -122.33937 -74.98489 -73.31142
## [19] -93.31041 -87.86314 -104.82059 -122.33006 -122.27082 -84.21353
## [25] -118.10464 -119.76740 -110.97513 -72.57007 -122.46731 -75.91772
## [31] -79.41459 -87.55737 -88.75036 -90.40691 -96.06749 -80.41394
## [37] -76.08060 -122.31650 -119.86607 -95.01694 -88.10648 -83.00071
## [43] -88.63454 -87.87991 -71.07591 -82.57059 -157.87646 -97.38425
## [49] -84.37784 -105.07410 -123.02203 -90.66826 -72.72984 -117.85311
## [55] -81.72195 -80.10412 -97.54820 -117.64836 -73.63430 -78.87871
## [61] -122.39797 -76.87058 -121.54758 -83.14465 -97.72780 -81.65565
## [67] -85.75941 -121.29078 -122.03635 -80.64297 -117.04308 -96.83868
## [73] -80.22668
## [1] TRUE
##
## Moran I test under randomisation
##
## data: data_shootings2$age_of_shooter
## weights: nb2listw(data_shootings2_nb, style = "W", zero.policy = TRUE) n reduced by no-neighbour observations
##
##
## Moran I statistic standard deviate = 0.49111, p-value = 0.3117
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.08664016 -0.04761905 0.07473523
An omnidirectional semivariogram is computed and modelled to explore the overall spatial continuity of the dataset. With increasing spatial separation ( h ), the correlation between two data points decreases. The range is the distance at which two data points become independent.
A general trend of increased or decreased semivariance was not observed. This would be due to the fact of randomness in the shooting data set or spatial independence.
A slight trend of increased semivariance was observed initially but the eventual decrease again suggests spatial independence. This would be due to the fact of randomness in the shooting data set.
Overall the semivariance across distance fluctuated between a semivariance range of 125 - 175.
Overall we noticed similar results to Morans I where there was no real observed trend of of semivariance with respect to distance.
## [1] 39.68663 39.98696 41.75373 27.47104 34.17695 39.10198 36.05252 38.39250
## [9] 26.30483 37.76595 26.07275 44.20412 34.43628 38.87498 34.00862 47.31296
## [17] 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 37.80438 33.94121
## [25] 33.74118 39.16380 32.33594 41.79876 47.15277 42.09980 35.33343 37.76721
## [33] 41.92947 38.58009 41.26572 37.22957 39.95890 47.62290 34.42557 47.87635
## [41] 43.06057 39.96226 32.41084 41.90816 42.50043 27.96648 21.32006 32.66451
## [49] 33.85012 39.60403 44.04624 35.82099 41.68563 33.78779 33.55986 26.11927
## [57] 27.82803 47.61864 40.72677 35.05299 37.79297 42.38106 39.07869 42.48948
## [65] 31.11712 30.33218 38.25424 37.95770 37.36883 28.03319 32.55200 32.92517
## [73] 25.79649
## [1] -86.32313 -105.25117 -88.33106 -81.45847 -118.87479 -84.51178
## [7] -86.61694 -122.36653 -80.26951 -122.40609 -80.14338 -88.46754
## [13] -119.87144 -76.99453 -118.49475 -122.33937 -74.98489 -73.31142
## [19] -93.31041 -87.86314 -104.82059 -122.33006 -122.27082 -84.21353
## [25] -118.10464 -119.76740 -110.97513 -72.57007 -122.46731 -75.91772
## [31] -79.41459 -87.55737 -88.75036 -90.40691 -96.06749 -80.41394
## [37] -76.08060 -122.31650 -119.86607 -95.01694 -88.10648 -83.00071
## [43] -88.63454 -87.87991 -71.07591 -82.57059 -157.87646 -97.38425
## [49] -84.37784 -105.07410 -123.02203 -90.66826 -72.72984 -117.85311
## [55] -81.72195 -80.10412 -97.54820 -117.64836 -73.63430 -78.87871
## [61] -122.39797 -76.87058 -121.54758 -83.14465 -97.72780 -81.65565
## [67] -85.75941 -121.29078 -122.03635 -80.64297 -117.04308 -96.83868
## [73] -80.22668
library(rgdal)
spat.data = readOGR("NAT.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/samarthgwalani/Desktop/Harrisburg/Analysis-555/Finals/NAT.shp", layer: "NAT"
## with 3085 features
## It has 70 fields
## Integer64 fields read as strings: STFIPS COFIPS FIPSNO
names(spat.data) #show variable names
## [1] "LAG1" "NAME" "STATE_NAME" "STATE_FIPS" "CNTY_FIPS"
## [6] "FIPS" "STFIPS" "COFIPS" "FIPSNO" "SOUTH"
## [11] "HR60" "HR70" "HR80" "HR90" "HC60"
## [16] "HC70" "HC80" "HC90" "PO60" "PO70"
## [21] "PO80" "PO90" "RD60" "RD70" "RD80"
## [26] "RD90" "PS60" "PS70" "PS80" "PS90"
## [31] "UE60" "UE70" "UE80" "UE90" "DV60"
## [36] "DV70" "DV80" "DV90" "MA60" "MA70"
## [41] "MA80" "MA90" "POL60" "POL70" "POL80"
## [46] "POL90" "DNL60" "DNL70" "DNL80" "DNL90"
## [51] "MFIL59" "MFIL69" "MFIL79" "MFIL89" "FP59"
## [56] "FP69" "FP79" "FP89" "BLK60" "BLK70"
## [61] "BLK80" "BLK90" "GI59" "GI69" "GI79"
## [66] "GI89" "FH60" "FH70" "FH80" "FH90"
spat.data$HR70 <- as.numeric(as.character(spat.data$HR70))
spplot(spat.data,"HR70") #make map
library(spdep)
queen.nb=poly2nb(spat.data)
rook.nb=poly2nb(spat.data,queen=FALSE)
queen.listw=nb2listw(queen.nb) #convert nb to listw type
rook.listw=nb2listw(rook.nb) #convert nb to listw type
listw1= queen.listw
#define our regression equation so we don't have to type it each time
reg.eq1= HR70~HC70+FP69+GI69
reg1=lm(reg.eq1,data=spat.data)
summary(reg1)
##
## Call:
## lm(formula = reg.eq1, data = spat.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33.660 -3.832 -1.268 2.463 68.496
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.242687 1.551440 4.024 5.86e-05 ***
## HC70 0.037539 0.003293 11.399 < 2e-16 ***
## FP69 0.374423 0.021129 17.721 < 2e-16 ***
## GI69 -17.530610 4.926354 -3.559 0.000379 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.574 on 3081 degrees of freedom
## Multiple R-squared: 0.2014, Adjusted R-squared: 0.2006
## F-statistic: 258.9 on 3 and 3081 DF, p-value: < 2.2e-16
lm.morantest(reg1,listw1)
##
## Global Moran I for regression residuals
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## Moran I statistic standard deviate = 26.968, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I Expectation Variance
## 0.2891894281 -0.0006973051 0.0001155469
lm.LMtests(reg1,listw1,test=c("LMerr", "LMlag", "RLMerr", "RLMlag", "SARMA"))
##
## Lagrange multiplier diagnostics for spatial dependence
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## LMerr = 720.85, df = 1, p-value < 2.2e-16
##
##
## Lagrange multiplier diagnostics for spatial dependence
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## LMlag = 834.12, df = 1, p-value < 2.2e-16
##
##
## Lagrange multiplier diagnostics for spatial dependence
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## RLMerr = 0.018748, df = 1, p-value = 0.8911
##
##
## Lagrange multiplier diagnostics for spatial dependence
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## RLMlag = 113.29, df = 1, p-value < 2.2e-16
##
##
## Lagrange multiplier diagnostics for spatial dependence
##
## data:
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
##
## SARMA = 834.14, df = 2, p-value < 2.2e-16
library(spatialreg)
## Warning: package 'spatialreg' was built under R version 4.1.2
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'spatialreg'
## The following objects are masked from 'package:spdep':
##
## get.ClusterOption, get.coresOption, get.mcOption,
## get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
## set.coresOption, set.mcOption, set.VerboseOption,
## set.ZeroPolicyOption
reg2.c=lmSLX(reg.eq1,data=spat.data, listw1)
summary(reg2.c)
##
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))),
## data = as.data.frame(x), weights = weights)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.306 -3.570 -1.259 2.247 68.109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.640049 2.321424 5.876 4.66e-09 ***
## HC70 0.035179 0.003266 10.770 < 2e-16 ***
## FP69 0.151785 0.029623 5.124 3.18e-07 ***
## GI69 -0.997493 5.741752 -0.174 0.862
## lag.HC70 0.027458 0.005699 4.818 1.52e-06 ***
## lag.FP69 0.394088 0.039838 9.892 < 2e-16 ***
## lag.GI69 -44.738152 8.670074 -5.160 2.63e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.442 on 3078 degrees of freedom
## Multiple R-squared: 0.2339, Adjusted R-squared: 0.2324
## F-statistic: 156.6 on 6 and 3078 DF, p-value: < 2.2e-16
impacts(reg2.c,listw=listw1)
## Impact measures (SlX, estimable):
## Direct Indirect Total
## HC70 0.03517931 0.02745829 0.0626376
## FP69 0.15178502 0.39408780 0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
summary(impacts(reg2.c,listw=listw1,R=500),zstats=TRUE) #Add zstats,pvals; R=500 not needed for SLX
## Impact measures (SlX, estimable, n-k):
## Direct Indirect Total
## HC70 0.03517931 0.02745829 0.0626376
## FP69 0.15178502 0.39408780 0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
## ========================================================
## Standard errors:
## Direct Indirect Total
## HC70 0.003266274 0.005698966 0.006251615
## FP69 0.029623416 0.039837611 0.029779251
## GI69 5.741751764 8.670073513 7.413703251
## ========================================================
## Z-values:
## Direct Indirect Total
## HC70 10.7704706 4.818119 10.019427
## FP69 5.1238188 9.892355 18.330643
## GI69 -0.1737262 -5.160066 -6.169069
##
## p-values:
## Direct Indirect Total
## HC70 < 2.22e-16 1.4492e-06 < 2.22e-16
## FP69 2.9941e-07 < 2.22e-16 < 2.22e-16
## GI69 0.86208 2.4686e-07 6.8693e-10
#3 - Spatial Lag Model
reg3=lagsarlm(reg.eq1,data= spat.data, listw1)
summary(reg3)
##
## Call:lagsarlm(formula = reg.eq1, data = spat.data, listw = listw1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -26.75969 -3.00971 -0.95129 1.88613 67.64144
##
## Type: lag
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.0142358 1.3705152 1.4697 0.1416
## HC70 0.0317994 0.0029064 10.9411 <2e-16
## FP69 0.1940177 0.0194051 9.9983 <2e-16
## GI69 -6.6096360 4.3444017 -1.5214 0.1282
##
## Rho: 0.51827, LR test value: 606.04, p-value: < 2.22e-16
## Asymptotic standard error: 0.020352
## z-value: 25.466, p-value: < 2.22e-16
## Wald statistic: 648.49, p-value: < 2.22e-16
##
## Log likelihood: -9881.772 for lag model
## ML residual variance (sigma squared): 33.562, (sigma: 5.7933)
## Number of observations: 3085
## Number of parameters estimated: 6
## AIC: 19776, (AIC for lm: 20380)
## LM test for residual autocorrelation
## test value: 82.747, p-value: < 2.22e-16
impacts(reg2.c,listw=listw1)
## Impact measures (SlX, estimable):
## Direct Indirect Total
## HC70 0.03517931 0.02745829 0.0626376
## FP69 0.15178502 0.39408780 0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
summary(impacts(reg2.c,listw=listw1,R=500),zstats=TRUE) #Add zstats,pvals; R=500 not needed for SLX
## Impact measures (SlX, estimable, n-k):
## Direct Indirect Total
## HC70 0.03517931 0.02745829 0.0626376
## FP69 0.15178502 0.39408780 0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
## ========================================================
## Standard errors:
## Direct Indirect Total
## HC70 0.003266274 0.005698966 0.006251615
## FP69 0.029623416 0.039837611 0.029779251
## GI69 5.741751764 8.670073513 7.413703251
## ========================================================
## Z-values:
## Direct Indirect Total
## HC70 10.7704706 4.818119 10.019427
## FP69 5.1238188 9.892355 18.330643
## GI69 -0.1737262 -5.160066 -6.169069
##
## p-values:
## Direct Indirect Total
## HC70 < 2.22e-16 1.4492e-06 < 2.22e-16
## FP69 2.9941e-07 < 2.22e-16 < 2.22e-16
## GI69 0.86208 2.4686e-07 6.8693e-10
reg4=errorsarlm(reg.eq1,data=spat.data, listw1)
summary(reg4)
##
## Call:errorsarlm(formula = reg.eq1, data = spat.data, listw = listw1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.8804 -3.1737 -1.0609 1.9782 68.8418
##
## Type: error
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.7535487 1.6277867 1.6916 0.09072
## HC70 0.0286870 0.0028887 9.9309 < 2e-16
## FP69 0.2259224 0.0239035 9.4514 < 2e-16
## GI69 -1.1442027 5.0732597 -0.2255 0.82156
##
## Lambda: 0.53997, LR test value: 547.23, p-value: < 2.22e-16
## Asymptotic standard error: 0.021007
## z-value: 25.704, p-value: < 2.22e-16
## Wald statistic: 660.71, p-value: < 2.22e-16
##
## Log likelihood: -9911.177 for error model
## ML residual variance (sigma squared): 34.024, (sigma: 5.833)
## Number of observations: 3085
## Number of parameters estimated: 6
## AIC: 19834, (AIC for lm: 20380)
#Spatial Hausman Test
Hausman.test(reg4)
##
## Spatial Hausman test (asymptotic)
##
## data: NULL
## Hausman test = 89.882, df = 4, p-value < 2.2e-16